Daily-Level GAM Analysis of Monarch Butterfly Abundance: Results Summary

Author

Kyle Nessen

Published

September 30, 2025

1. Motivation and Background

Why a Daily-Level Analysis?

I was motived to do this analysis after a conversation with Jessica Griffiths, who was not completely convinced by the results of our 30-min analysis. She cited observing butterflies relocating at Pismo to the leeward side of trees, which I have also observed. I think we should also take Leong’s observations in good faith that there is at least some anecdotal evidence of monarchs responding to wind. So I took a very similar approach to the lagged analysis we did before, but over days instead of 30 minutes. To calculate the change from the previous day (t-1) to the current day (t), I take the 95% percentile of the count for those days (more on that below).

Why Use Peak Counts?

Rather than analyzing counts at arbitrary time points, I use the 95th percentile of each day’s counts. This approach:

  • Captures the maximum cluster size when butterflies are present
  • Controls for measurement timing uncertainty
  • Provides a robust measure less sensitive to outliers than the raw maximum
  • If butterflies emigrated, the peak count would be markedly lower

The 24-Hour Time Window

I examine previous day conditions affecting next-day abundance because:

  • This represents a plausible biological response window
  • Longer time horizons introduce uncertainty—I cannot track individual butterflies, only count populations
  • Beyond 24 hours, I lose confidence that I’m observing the same population that experienced the initial conditions

2. Data Structure

Dataset characteristics:
- Observations: 103 day-pairs
- Deployments: 7 
- Date range: 2023-11-19 to 2024-02-03 
- Temporal structure: Consecutive days within deployments

2.1 Response Variable Selection

Response Variable: Square-root transformed difference in 95th percentile butterfly counts between consecutive days (butterfly_diff_95th_sqrt)

Selecting an appropriate response variable is critical for valid statistical inference. I systematically evaluated multiple candidate response variables and transformations to identify the distribution that best meets model assumptions.

Response Variable Candidates

I considered three measures of daily abundance change:

  1. Difference in maximum counts: max_butterflies_t - max_butterflies_t_1
  2. Difference in 95th percentile counts: butterflies_95th_percentile_t - butterflies_95th_percentile_t_1 (more robust to outliers)
  3. Difference in mean of top 3 counts: butterflies_top3_mean_t - butterflies_top3_mean_t_1

For each candidate, I tested two transformations:

  • Original: Untransformed difference
  • Square root: Sign-preserving square root transformation

Normality Assessment

I assessed normality using skewness, kurtosis, and the Shapiro-Wilk and Anderson-Darling tests to evaluate symmetry, tail behavior, and overall distributional fit.

A composite normality score (0-1 scale) was calculated weighting all criteria, with higher scores indicating better normality.

Results: Normality Comparison

Figure 1. Normality assessment for candidate response variables and transformations. Each panel shows the distribution (histogram) with a normal curve overlay (red line). Rank indicates overall normality score, with Rank 1 being most normal. The normality score combines skewness, kurtosis, and formal normality tests.

Selected Response Variable

Winner: butterfly_diff_95th_sqrt (square-root transformed difference in 95th percentile counts)

3. Candidate Predictor Variables

I considered numerous potential environmental predictors characterizing full-day weather patterns. All predictors are the weather conditions of the previous day (t-1), as those are the conditions the butterflies are experiencing. This analysis sees if there is a durable change the following day (t). The table below describes all variables initially considered for model inclusion, though not all were used (more below).

3.1 Butterfly Abundance Metrics (Previous Day)

Similar to the 30-min analysis, these metrics are there to help control for differences in cluster sizes. Follows the response variables.

Butterfly abundance metrics from previous day
Variable Description Units Role
max_butterflies_t_1 Maximum count observed on previous day count Baseline
butterflies_95th_percentile_t_1 95th percentile of counts on previous day (baseline population measure) count Baseline
butterflies_top3_mean_t_1 Mean of top 3 counts on previous day count Baseline

3.2 Sun Exposure Metrics (Previous Day)

I only considered one sun exposure metric, which was the sum of all butterflies in direct sunlight. I suppose I could have tried maximum, but this felt like a better proxy for the day.

Sun exposure metrics from previous day
Variable Description Units Role
sum_butterflies_direct_sun_t_1 Sum of all butterflies observed in direct sunlight throughout previous day (proxy for sun exposure) cumulative count Predictor

3.3 Temperature Metrics (Previous Day)

Because we are considering a whole day, I generated several reasonable temperature metrics. I do not use all of them, however.

Temperature metrics from previous day
Variable Description Units
temp_max_t_1 Maximum temperature during daylight hours (6am-6pm) °C
temp_min_t_1 Minimum temperature during daylight hours °C
temp_mean_t_1 Mean temperature during daylight hours °C
temp_at_max_count_t_1 Temperature at the time when maximum butterfly count was observed °C
hours_above_15C_t_1 Number of hours with temperature ≥15°C (activity threshold) hours
degree_hours_above_15C_t_1 Cumulative degree-hours above 15°C (thermal energy available for activity) degree-hours

3.3 Wind Metrics (Previous Day)

Again, in order to capture the story of the whole day, I generated as many reasonable wind metrics as I could think of. Not all were used.

Wind metrics from previous day
Variable Description Units
wind_avg_sustained_t_1 Mean sustained wind speed during daylight hours m/s
wind_max_gust_t_1 Maximum wind gust speed recorded (strongest wind event) m/s
wind_gust_sum_t_1 Sum of all gust speed measurements (cumulative wind exposure) m/s
wind_gust_sum_above_2ms_t_1 Sum of gust speeds when wind >2 m/s (threshold-weighted exposure) m/s
wind_gust_hours_t_1 Integral of gust speed over time (gust-hours of exposure) gust-hours
wind_minutes_above_2ms_t_1 Number of minutes with wind ≥2 m/s (duration above activity threshold) minutes
wind_gust_sd_t_1 Standard deviation of gust speeds (wind variability/gustiness) m/s
wind_mode_gust_t_1 Most frequent gust speed in 0.5 m/s bins (typical wind condition) m/s

3.4 Correlation Matrix: All Candidate Predictors

This correlation matrix shows all potential fixed effects initially considered. I manually then reviewed this table and made decisions on which variables to include the model runs.

For previous day counts, I simply followed the chosen response variable, 95% percentile. Sun exposure only had the one option.

For temperature, there were several highly correlated pairs. I ultimately decided on temp_min, temp_max, as useful predictors given the thermoregulation of monarchs, and also temp_at_max_count, because it was not particularly correlated with anything else, and ‘temp_mean’ was so close to the max temperature.

Wind was surprisingly correlated. wind_gust_sd did stand out as a variable which avoided the worst of the correlation with other predictors, but not wind_max_gust, which has a correlation of 0.76. I ultimately chose wind_max_gust because it seems to reasonbly capture all other wind metrics (except wind_mode_gust, which I don’t know how to interpret), and directly matches our previous analysis.

4. Final Model Variables

After correlation analysis, the following variables were selected for model testing to minimize multicollinearity while maintaining biological interpretability:

Variables included in final models
Variable Role Description
butterfly_diff_95th_sqrt Response Square-root transformed change in 95th percentile count between consecutive days
butterflies_95th_percentile_t_1 Baseline (B models only) Previous day’s 95th percentile count (baseline population size)
temp_max_t_1 Predictor Previous day’s maximum temperature (°C)
temp_min_t_1 Predictor Previous day’s minimum temperature (°C)
temp_at_max_count_t_1 Predictor Temperature when peak count occurred on previous day (°C)
wind_max_gust_t_1 Predictor Previous day’s maximum wind gust speed (m/s)
sum_butterflies_direct_sun_t_1 Predictor Sum of butterflies in direct sun on previous day

Correlation Matrix: Final Model Variables

This correlation matrix shows only the variables ultimately included in model testing, representing a reduced set with lower multicollinearity:

Correlation coefficients for final model variables
butterfly_diff_95th_sqrt butterflies_95th_percentile_t_1 temp_max_t_1 temp_min_t_1 temp_at_max_count_t_1 wind_max_gust_t_1 sum_butterflies_direct_sun_t_1
butterfly_diff_95th_sqrt 1.000 -0.389 -0.112 -0.042 0.145 0.193 -0.072
butterflies_95th_percentile_t_1 -0.389 1.000 -0.146 -0.299 -0.132 -0.211 0.442
temp_max_t_1 -0.112 -0.146 1.000 0.173 0.215 -0.334 0.016
temp_min_t_1 -0.042 -0.299 0.173 1.000 0.351 0.210 -0.331
temp_at_max_count_t_1 0.145 -0.132 0.215 0.351 1.000 -0.116 0.098
wind_max_gust_t_1 0.193 -0.211 -0.334 0.210 -0.116 1.000 -0.122
sum_butterflies_direct_sun_t_1 -0.072 0.442 0.016 -0.331 0.098 -0.122 1.000

5. Modeling Approach

5.1 Two Model Families: Absolute vs. Proportional Effects

I tested 105 models in two distinct families to distinguish different mechanisms by which environment affects abundance:

M Models (n=50): Absolute Effects

  • Do NOT include previous day’s butterfly count
  • Test whether environmental variables cause fixed-magnitude changes regardless of population size
  • Example: “Strong wind reduces abundance by X butterflies”

B Models (n=55): Proportional/Density-Dependent Effects

  • Include butterflies_95th_percentile_t_1 as covariate or interaction term
  • Test whether environmental effects scale with baseline population
  • Example: “Strong wind reduces abundance by X% of the population”

5.2 Model Specifications

Within each family, models varied in:

  1. Fixed effects structure:
    • Single predictors vs. combinations
    • Linear terms vs. smooth terms (splines)
    • Main effects only vs. interactions
    • Different temperature metric combinations
  2. Random effects (all models):
    • Random intercept: deployment_id
    • Temporal autocorrelation: AR1 within deployments using day_sequence | deployment_id
  3. Correlation structures tested:
    • No temporal correlation (baseline)
    • AR1 correlation within deployments

5.3 Sample Model Formulas

Simple M models (absolute effects):

M1: butterfly_diff_95th_sqrt ~ 1
M2: butterfly_diff_95th_sqrt ~ wind_max_gust_t_1
M24: butterfly_diff_95th_sqrt ~ s(wind_max_gust_t_1)

Complex M models:

M34: butterfly_diff_95th_sqrt ~ s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)
M41: butterfly_diff_95th_sqrt ~ temp_at_max_count_t_1 * wind_max_gust_t_1

B models (proportional effects):

B1: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1
B19: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1
B31: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1)
B51: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 * wind_max_gust_t_1

6. Model Selection Results

Note: Full model fitting and results are documented in the analysis document.
This summary presents key findings for interpretation.
Total models tested: 105 (50 M models + 55 B models)
Each model tested with 2 correlation structures: no_corr and AR1
Total model fits: 210

6.1 Top 10 Models by AIC

The table below shows the 10 best-performing models. Note that all top models are B models (include baseline count), indicating that proportional/density-dependent effects provide substantially better fit than absolute effects.

Click to expand: Top 10 Models Table

Model naming convention: - First part (B19, B31, etc.): Model formula specification (see Appendix A) - Last part (AR1 or no_corr): Correlation structure used

Top 10 Models by AIC
Model Correlation AIC Delta_AIC AIC_weight df
B33_AR1 AR1 668.401 0.000 0.148 9
B29c_AR1 AR1 668.671 0.270 0.129 8
B28_AR1 AR1 669.101 0.700 0.104 7
B35_AR1 AR1 669.573 1.172 0.082 13
B37_AR1 AR1 669.594 1.193 0.081 15
B29_AR1 AR1 669.685 1.284 0.078 6
B34_AR1 AR1 670.016 1.615 0.066 11
B29a_AR1 AR1 670.504 2.103 0.052 7
B38_AR1 AR1 670.691 2.289 0.047 10
B29d_AR1 AR1 670.864 2.463 0.043 8

Key findings: - All top 10 models are B models (include baseline butterfly count) - All top 10 models use AR1 correlation structure - ΔAIC between best M model and best B model: >50 (see full table in Appendix B)

6.2 Key Findings from Model Selection

  1. Density-dependence is critical: B models vastly outperform M models (ΔAIC > 50)
    • Environmental effects scale with population size
    • Cannot understand day-to-day changes without accounting for starting population
  2. AR1 correlation structure preferred: Models with temporal autocorrelation fit better
    • Day-to-day changes are not independent
    • Consecutive days within deployments are correlated
  3. Non-linear relationships: Smooth term models generally outperform linear models
    • Relationships between environment and abundance change are complex
    • Simple linear effects inadequate to capture response patterns
  4. Model uncertainty: Multiple models with ΔAIC < 2 indicates:
    • No single “best” model stands out
    • Different variable combinations provide similar explanatory power
    • Results should be interpreted with appropriate uncertainty

6.3 Models with Strong Support (ΔAIC < 2)

Multiple models show strong empirical support, suggesting uncertainty in the precise form of environmental effects:

Number of models with ΔAIC < 2: 7 
Models with Strong Empirical Support (ΔAIC
Model Correlation Formula AIC Delta_AIC AIC_weight df
B33_AR1 AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) 668.4011 0.0000 0.1477 9
B29c_AR1 AR1 butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) + s(wind_max_gust_t_1) 668.6711 0.2700 0.1291 8
B28_AR1 AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(sum_butterflies_direct_sun_t_1) 669.1010 0.6999 0.1041 7
B35_AR1 AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) 669.5730 1.1719 0.0822 13
B37_AR1 AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) 669.5941 1.1930 0.0814 15
B29_AR1 AR1 butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) 669.6853 1.2842 0.0777 6
B34_AR1 AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) 670.0163 1.6152 0.0659 11

7. Best Model Results

7.1 Model Specification

Best model: NA

Formula:

NA

Model structure: - Random effects: ~1 | deployment_id - Correlation: AR1(form = ~day_sequence | deployment_id) - Family: Gaussian

7.2 Model Performance

Best Model Performance Metrics
Metric Value
AIC NA
R_squared NA
Deviance_Explained NA
N_obs NA

7.3 Partial Response Curves

The plots below show the estimated effect of each predictor on changes in butterfly abundance, holding other variables constant. The y-axis represents the change in square-root transformed 95th percentile count.

Interpretation guide: - X-axis: Predictor variable value - Y-axis: Partial effect on butterfly_diff_95th_sqrt (change in abundance) - Shaded area: 95% confidence interval - Horizontal line at y=0: No change in abundance - Above zero: Positive effect (increase in abundance) - Below zero: Negative effect (decrease in abundance)

7.4 Results Notes

[INCOMPLETE SECTION - TO BE FILLED IN]

The partial response curves above show the relationships between each predictor and changes in butterfly abundance. The specific patterns and statistical significance vary across the different competitive models.

8. Model Diagnostics

Key diagnostic checks:

  1. Residuals vs. Fitted: Check for heteroscedasticity and non-linear patterns
  2. Q-Q Plot: Assess normality of residuals
  3. Scale-Location: Check for homogeneity of variance
  4. Residual Histogram: Overall distribution assessment

Notable improvement: Daily aggregation removed many counting artifacts present in 30-minute interval analysis residuals.

9. Discussion

9.1 Model Results Summary

Baseline abundance: - All well-supported models include previous day butterfly count - Environmental effects appear to be proportional to population size, not absolute

Wind effects: Wind showed up frequently in the top models, but 6 of the 10 had a delta AIC of less than 2. The wind model p values hover just above 0.05, but all had roughly similar curve shapes. If we interpret these curves, then wind has stronger support for influencing day to day clustering behavior, but not at all as predicted (cluster size increases the next day if max gust 3-5 m/s, and largest decreases are predicted at 0 m/s). This analysis has a much smaller dataset (100 pts), and my concern is that we are picking up some peculiar behavior at Spring Canyon in 2023, not detecting something general about monarch behavior.

9.2 Comparison with 30-Minute Interval Analysis

This daily-level analysis does not contradict our 30-minute interval analysis:

  • 30-minute analysis: No detectable wind effect on instantaneous counts
  • Daily analysis: Some evidence for wind effect on day-to-day persistence, with marginally significant p-values and complex curve shapes

9.3 Limitations and Caveats

Sample size: - Only ~100 day-pairs vs. thousands of observations in our 30-minute analysis - Weaker statistical power for detecting effects - Greater model uncertainty (multiple models with ΔAIC < 2)

Time horizon constraints: - 24-hour window chosen to balance biological plausibility with tracking uncertainty - I cannot follow individual butterflies; only observe population counts - Longer time horizons would introduce ambiguity about population identity

Wind effect interpretation: - Unexpected curve shape with largest decreases at zero wind - Single data point with zero wind day may influence pattern - Pattern may be specific to Spring Canyon 2023 rather than general

Residual model diagnostics: - While improved over 30-minute analysis, residuals still show some structure - AR1 correlation handles some but not all temporal dependence

9.4 Field Observations Context

Jessica Griffiths and I have both observed monarchs moving to leeward sides of trees in response to wind at Pismo. These short-term, within-site movements may not translate to the day-to-day changes in peak counts captured by this analysis, which could represent different behavioral processes at different temporal and spatial scales.

9.5 Future Considerations

Additional years of data would help stabilize partial effect estimates and assess year-to-year consistency. The single day with zero wind warrants investigation as it may influence the unexpected curve shape. Testing alternative time windows (48-hour, 72-hour) could reveal whether effects accumulate over multiple days.

10. Conclusions

This daily-level analysis provides a complementary perspective to our 30-minute interval analysis:

  1. Baseline population size is critical: Day-to-day changes in abundance cannot be understood without accounting for the starting population (all top models are B models)

  2. Complex environmental relationships: Non-linear smooth terms outperform linear effects

  3. Wind effects: Some evidence for wind effects at daily timescales with marginally significant p-values and unexpected curve shapes that may be influenced by limited data

  4. Complementary to prior analysis: Results do not contradict our 30-minute interval findings - different temporal scales may capture different processes

  5. Sample size considerations: Limited data (~100 observations) with multiple competitive models (ΔAIC < 2) indicating model uncertainty

The analysis suggests environmental effects on monarch abundance are complex and scale-dependent. Wind showed up frequently in top models but with patterns that differ from predictions, potentially reflecting site-specific patterns at Spring Canyon 2023 rather than general monarch behavior.


Appendix A: Complete Model Specifications

Click to expand: All 105 model formulas

M Models (Absolute Effects, n=50)

Models without previous day butterfly count, testing absolute environmental effects:

Null and single predictor models: - M1: ~ 1 (null model) - M2: ~ wind_max_gust_t_1 - M3: ~ temp_max_t_1 - M4: ~ temp_min_t_1 - M5: ~ temp_at_max_count_t_1 - M6: ~ sum_butterflies_direct_sun_t_1

Temperature combinations (linear): - M8: ~ temp_max_t_1 + temp_min_t_1 - M9: ~ temp_max_t_1 + temp_at_max_count_t_1 - M10: ~ temp_min_t_1 + temp_at_max_count_t_1 - M11: ~ temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1

Two-variable combinations: - M12: ~ wind_max_gust_t_1 + temp_max_t_1 - M13: ~ wind_max_gust_t_1 + temp_min_t_1 - M14: ~ wind_max_gust_t_1 + temp_at_max_count_t_1 - M15: ~ wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M16: ~ temp_at_max_count_t_1 + sum_butterflies_direct_sun_t_1

Full models with various temperature specs (linear): - M17: ~ temp_max_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M18: ~ temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M19: ~ temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M20: ~ temp_max_t_1 + temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M21: ~ temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1

Single smooth terms: - M24: ~ s(wind_max_gust_t_1) - M25: ~ s(temp_max_t_1) - M26: ~ s(temp_min_t_1) - M27: ~ s(temp_at_max_count_t_1) - M28: ~ s(sum_butterflies_direct_sun_t_1)

Smooth term combinations: - M30: ~ s(temp_max_t_1) + s(temp_min_t_1) - M31: ~ s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) - M32: ~ s(temp_at_max_count_t_1) + s(sum_butterflies_direct_sun_t_1) - M33: ~ s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)

Complex smooth models: - M34: ~ s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - M35: ~ s(temp_max_t_1) + s(temp_min_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - M37: ~ s(temp_max_t_1) + s(temp_min_t_1) + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)

Mixed linear and smooth: - M38: ~ temp_at_max_count_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - M39: ~ s(temp_at_max_count_t_1) + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M40: ~ s(temp_at_max_count_t_1) + wind_max_gust_t_1 + s(sum_butterflies_direct_sun_t_1)

Interactions: - M41: ~ temp_at_max_count_t_1 * wind_max_gust_t_1 - M42: ~ temp_at_max_count_t_1 * sum_butterflies_direct_sun_t_1 - M43: ~ wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - M44: ~ temp_at_max_count_t_1 * wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M45: ~ temp_at_max_count_t_1 + wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - M46: ~ temp_at_max_count_t_1 * wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1

Temperature range models: - M47: ~ I(temp_max_t_1 - temp_min_t_1) - M48: ~ I(temp_max_t_1 - temp_min_t_1) + wind_max_gust_t_1 - M49: ~ s(I(temp_max_t_1 - temp_min_t_1)) - M50: ~ s(I(temp_max_t_1 - temp_min_t_1)) + s(wind_max_gust_t_1)

B Models (Proportional Effects, n=55)

All models include butterflies_95th_percentile_t_1 to test density-dependent effects:

Baseline-only: - B1: ~ butterflies_95th_percentile_t_1

Single predictor + baseline (linear): - B2: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 - B3: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 - B4: ~ butterflies_95th_percentile_t_1 + temp_min_t_1 - B5: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 - B6: ~ butterflies_95th_percentile_t_1 + sum_butterflies_direct_sun_t_1

Temperature combinations + baseline: - B8: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 - B9: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_at_max_count_t_1 - B10: ~ butterflies_95th_percentile_t_1 + temp_min_t_1 + temp_at_max_count_t_1 - B11: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1

Two-variable combinations + baseline: - B12: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + temp_max_t_1 - B13: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + temp_min_t_1 - B14: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + temp_at_max_count_t_1 - B15: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B16: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + sum_butterflies_direct_sun_t_1

Full models + baseline: - B17: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B18: ~ butterflies_95th_percentile_t_1 + temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B19: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B20: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B21: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1

Single smooth + baseline: - B24: ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) - B25: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) - B26: ~ butterflies_95th_percentile_t_1 + s(temp_min_t_1) - B27: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) - B28: ~ butterflies_95th_percentile_t_1 + s(sum_butterflies_direct_sun_t_1)

Smooth baseline variations: - B29: ~ s(butterflies_95th_percentile_t_1) - B29a: ~ s(butterflies_95th_percentile_t_1) + wind_max_gust_t_1 - B29b: ~ s(butterflies_95th_percentile_t_1) + temp_at_max_count_t_1 - B29c: ~ s(butterflies_95th_percentile_t_1) + s(wind_max_gust_t_1) - B29d: ~ s(butterflies_95th_percentile_t_1) + s(temp_at_max_count_t_1)

Smooth combinations + baseline: - B30: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) - B31: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) - B32: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(sum_butterflies_direct_sun_t_1) - B33: ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)

Complex smooth + baseline: - B34: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - B35: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - B37: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)

Mixed linear and smooth + baseline: - B38: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - B39: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B40: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + wind_max_gust_t_1 + s(sum_butterflies_direct_sun_t_1)

Interactions + baseline: - B41: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * wind_max_gust_t_1 - B42: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * sum_butterflies_direct_sun_t_1 - B43: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - B44: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B45: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - B46: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1

Temperature range + baseline: - B47: ~ butterflies_95th_percentile_t_1 + I(temp_max_t_1 - temp_min_t_1) - B48: ~ butterflies_95th_percentile_t_1 + I(temp_max_t_1 - temp_min_t_1) + wind_max_gust_t_1 - B49: ~ butterflies_95th_percentile_t_1 + s(I(temp_max_t_1 - temp_min_t_1)) - B50: ~ butterflies_95th_percentile_t_1 + s(I(temp_max_t_1 - temp_min_t_1)) + s(wind_max_gust_t_1)

Interactions with baseline (density-dependent environmental effects): - B51: ~ butterflies_95th_percentile_t_1 * wind_max_gust_t_1 - B52: ~ butterflies_95th_percentile_t_1 * temp_at_max_count_t_1 - B53: ~ butterflies_95th_percentile_t_1 * sum_butterflies_direct_sun_t_1 - B54: ~ butterflies_95th_percentile_t_1 * wind_max_gust_t_1 + temp_at_max_count_t_1 - B55: ~ butterflies_95th_percentile_t_1 * temp_at_max_count_t_1 + wind_max_gust_t_1

Note: All models include random intercept for deployment_id and were tested with both no correlation and AR1 correlation structures.

Appendix B: Full AIC Table

Click to expand: Complete AIC rankings for all 210 model fits

Summary statistics:

Total models fitted: 200 
Models with ΔAIC < 2: 7 
Models with ΔAIC < 4: 13 
Models with ΔAIC < 10: 32 
Best M model (no baseline):
  Model: M34_AR1 
  ΔAIC: 14.07 
Best B model (with baseline):
  Model: B33_AR1 
  ΔAIC: 0 

Document prepared: 2025-09-30 Analysis code: analysis/monarch_daily_gam_analysis.qmd Data source: data/monarch_daily_lag_analysis.csv Data preparation: data_prep_daily_lag.py